Abstract: Providing an adequate long-term participation incentive
is important for a participatory sensing system to maintain a
sufficient number of active users (sensors), so as to collect a
sufficient number of data samples and support a desired level of
service quality. In this work, we consider the sensor selection
problem in a general time-dependent and location-aware
participatory sensing system, taking the long-term user participation
incentive into explicit consideration. We study the problem
systematically under different information scenarios, regarding
both future information and current information (realization).
In particular, we propose a Lyapunov-based VCG auction policy
for on-line sensor selection, which converges asymptotically
to the optimal off-line benchmark performance, even with no
future information and under (current) information asymmetry.
Extensive numerical results show that our proposed policy
outperforms the state-of-the-art policies in the literature, in terms
of both user participation (e.g., reducing the user dropping
probability by 25% to 90%) and social performance (e.g., increasing
the social welfare by 15% to 80%).

I. Introduction 
A. Background and Motivations 

The proliferation of mobile devices (e.g., smartphones)
with rich embedded sensors has led to a revolutionary new sensing
paradigm, often known as Participatory Sensing [1]-[3],
in which mobile users voluntarily participate in and actively
contribute to a sensing system, using the smartphones they carry.
Owing to its low deployment cost and high sensing coverage, this
new paradigm has attracted a wide range of applications in
environment, infrastructure, and community monitoring (e.g.,
air pollution [4]-[6], wireless signal strengths [7]-[9], road
traffic [10]-[12], and parking [13], [14]).

A typical participatory sensing system architecture usually
consists of a service platform (also called service provider)
residing in the cloud and a collection of mobile smartphone
users [?]-[?]. The service provider launches a set of sensing
tasks with different sensing requirements for different purposes,
and mobile users actively subscribe to (participate in)
one or multiple sensing task(s). In this work, we focus on
an important type of participatory sensing scheme called
server-initiated sensing, where the service provider selects a
specific set of participating smartphones to perform the sensing
task, depending on the spatio-temporal data requirement of the
sensing task and the geographical locations of the participating
users as well as their sensing capabilities. Compared with
the user-initiated sensing scheme (where users actively decide
when and where to sense), the server-initiated sensing scheme
gives the service provider more control over when and where to
collect the data and at what cost, and hence can better fit the
requirements of sensing tasks. Clearly, the success of such a
sensing system relies strongly on the active participation of
users as well as their willingness to contribute their sensing
capability and resources to the sensing tasks.

Although many participatory sensing applications have
been proposed in [?]-[?], they simply assume that users
voluntarily participate in the system to perform sensing tasks.
In reality, however, users may not be willing to participate in
the sensing system, as doing so incurs extra operational cost
(e.g., battery energy expenditure and transmission expense).
Moreover, many sensing tasks are location-aware and
time-dependent, and involve spatio-temporal context. Sharing
sensing data tagged with spatio-temporal context may reveal a
lot of personal and sensitive information, which poses potential
privacy threats to the participating users [?]. All of these bring
the incentive issue to the fore.

Several recent works have been devoted to the incentive
mechanism design issue in participatory sensing, mainly using
pricing and auctions [17]-[25]. Most of them focus on compensating
the user’s direct sensing cost when being chosen
as a sensor to perform a particular sensing task (e.g., in
[17]-[23]), which we call the short-term sensing incentive.
In practice, however, we find that the users participating in
a sensing task may suffer a certain indirect cost even when
not performing the sensing task.^1 In this case, the short-term
sensing incentive may not be enough to guarantee the long-term
continuous participation of users. Intuitively, if a user is
rarely selected as a sensor (hence hardly receives the short-term
sensing incentive), the user may lose interest in continuous
participation and decide to drop out of the sensing system (e.g.,
shut down the sensing app on his smartphone). Without an
adequate number of users participating in the system, however,
the service provider may not be able to collect a sufficient
amount of sensing data to support a desired service quality
(e.g., it may miss the road traffic information in some areas).

To the best of our knowledge, [24] and [25] are the only
results that explicitly study the long-term participation incentive
in participatory sensing. To stimulate the continuous
participation of users, Lee et al. in [24] and [25] introduce
a virtual credit for lowering the bids of users who lost in
the previous round of auction, hence increasing their winning
probabilities in future auction rounds. However, they consider
neither the truthfulness nor the optimality of the proposed
auction. In this work, we study the long-term participation
incentive jointly with the short-term sensing incentive, with
rigorous truthfulness and optimality analysis.


^1 For example, in a location-aware sensing task, users need to periodically
report their locations to the service provider before the latter makes the sensor
selection decision, which incurs a certain energy and transmission cost.
Fig. 1. Location-Aware Participatory Sensing Model.

B. Solutions and Contributions 

Specifically, we consider a general location-aware,
time-dependent participatory sensing system, where the data in
different time slots and/or locations may have different values
for the sensing tasks. Each participating user has the potential
to sense a specific region (at a certain sensing cost) in a specific
time slot, depending on his location and mobility pattern. Fig. 1
illustrates a snapshot of such a sensing system (in a particular
time slot), where the sensing region of each user is denoted
by the shaded area around the user. In such a system, the
service provider selects (allocates) users as sensors to perform
sensing tasks slot by slot. We focus on the following sensor
selection/allocation problem for the service provider:

• Which users should be selected as sensors in each time 
slot, aiming at maximizing the social welfare and ensuring the 
long-term participation incentive of users? 

The problem is challenging for the following reasons. First,
the overlap of different users’ sensing regions makes their
sensing activities possibly redundant (hence partially in “conflict”
with each other). Second, the long-term participation incentive
of users couples the sensor allocations in different time slots.
Based on the above, our model and problem formulation
capture the following important features of a participatory
sensing system: (i) long-term participation incentive, (ii)
time-dependent and location-aware sensing requirement, and (iii)
partially conflicting sensing activity. As far as we know, this is
the first work that systematically studies a participatory sensing
problem with all of the above features.

We solve the above sensor selection problem under different
information scenarios, regarding both future information
(i.e., complete, stochastic, or no future information) and current
information (i.e., symmetric or asymmetric). Specifically,
with complete or stochastic future information, we formulate
and solve an off-line sensor selection problem as a benchmark
(where we assume that the current information is always
symmetric). With no future information, we formulate and
solve an on-line sensor selection problem: (i) under information
symmetry, we propose a Lyapunov-based on-line sensor
selection policy (Policy 1), which converges to the optimal
off-line benchmark asymptotically; and (ii) under information
asymmetry, we propose a Lyapunov-based VCG auction policy
(Policy 2), which is truthful, and meanwhile achieves the same
asymptotically optimal performance as Policy 1.^2

TABLE I. Main Results in This Paper

Future Info.            Current Info.   Solution           Performance           Section
Complete / Stochastic   Symmetric       Off-line           Optimal (Benchmark)   III
No Info                 Symmetric       On-line Policy 1   Asymptotic Opt.       IV
No Info                 Asymmetric      On-line Policy 2   Asymptotic Opt.       V

It is important to note that the key contributions of this
work are not on the Lyapunov framework itself, but rather
on the novel problem formulation and solution techniques. For
more clarity, we list the main results in Table I and summarize
the key contributions as follows.

• Novel Model and Problem Formulation: We study a general
time-dependent and location-aware participatory sensing
system, taking into consideration the important but less studied
issue of long-term user participation incentive. We propose a
simple yet representative formulation based on the allocation
probability of each user to capture such an incentive.

• Multiple Information Scenarios: We study the optimal 
sensor selection problem under different information scenarios. 
In particular, we propose on-line sensor selection policies that 
converge to the asymptotically optimal performance, even with 
no future information and under information asymmetry. 

• Performance Evaluations: We compare the proposed
on-line policies with the state-of-the-art policies, and show that
our proposed policies outperform the existing ones significantly,
in terms of both user participation and social performance:
(i) Compared with the RADP-VPC policy proposed in [24],
[25], our policies can reduce the user dropping probability by
25% to 50%, and increase the social welfare by 15% to 40%;
(ii) Compared with the Greedy/Random policy widely used in
existing systems (e.g., [10]), our policies can reduce the user
dropping probability by 70% to 90%, and increase the social
welfare by 65% to 80%.

II. System Model 

We consider a location-aware participatory sensing system
with a service provider (SP) and a set $\mathcal{N} = \{1, \dots, N\}$ of
mobile smartphone users (participating in the system). The SP
wants to collect specific data in a certain area (via participating
users’ smartphones) for specific tasks. Mobile users move
randomly in and out of the desired sensing area according to
certain mobility patterns. As shown in Fig. 1, each user has the
potential to sense a specific region in a certain period according
to his location and mobility, and the whole sensing area $A$ is
divided into a set $\mathcal{I} = \{1, \dots, I\}$ of grids.^3 Each grid $A_i$, $i \in \mathcal{I}$,
is associated with a weight $w_i[t]$, denoting the value of the data
associated with grid $A_i$ in each slot $t$. Obviously, such a data
value is location-aware and time-dependent.

The SP requests data slot by slot, where each time slot
ranges from minutes to hours depending on the temporal data
requirements of tasks. We consider the sensing operation over a
long period consisting of a set $\mathcal{T} = \{1, \dots, T\}$ of $T$ slots. At
the beginning of each time slot, the SP selects (allocates) a set
of users to perform the sensing task in that time slot, depending
on factors such as the user locations and the data values. Let
$x_n[t] \in \{0,1\}$ denote whether user $n$ is selected as a sensor in
slot $t$, and $\boldsymbol{x}[t] = \{x_n[t], \forall n \in \mathcal{N}\}$ denote the sensor selection
vector in slot $t$. We further denote $\boldsymbol{x}_n = \{x_n[t], \forall t \in \mathcal{T}\}$ as
the allocation vector of user $n$ over all time slots.

^2 Several recent works also studied on-line policies for sensing task
allocation, considering the uncertainty of user arrival [?]. However,
these works did not consider the long-term participation incentive of users.

^3 A grid is the minimum unit of sensing area at a particular location, e.g.,
a square of 100 m x 100 m, associated with a single data item in a particular time slot.


A. Mobile User Modeling 

1) Sensing Region: Each mobile user has a certain sensing
range in each time slot, mainly depending on his location and
mobility pattern. In Fig. 1, the sensing region of each user is
illustrated by the shaded area around the user. Let $z_{n,i}[t] \in \{0,1\}$
denote whether a grid $A_i$ is located in the sensing range
of user $n$ in slot $t$. Then, the total sensing region of user $n$ in
each slot $t$ can be defined as a vector: $\boldsymbol{z}_n[t] = \{z_{n,i}[t], \forall i \in \mathcal{I}\}$.
Note that when user $n$ moves out of the desired sensing area
in time slot $t$, we can simply define $z_{n,i}[t] = 0, \forall i \in \mathcal{I}$. As
mobile users move randomly, the sensing region $\boldsymbol{z}_n[t]$ of each
user $n$ also changes randomly across time slots.

2) Sensing Value: When a user $n$ is selected as a sensor in a
slot $t$, i.e., $x_n[t] = 1$, he performs the following sensing task:
collect, process, and transmit all of the data within his sensing
region $\boldsymbol{z}_n[t]$ to the SP. This generates a sensing value $v_n[t]$
equal to the sum of the weights of all grids within $\boldsymbol{z}_n[t]$:

$$ v_n[t] \triangleq \sum_{i \in \mathcal{I}} z_{n,i}[t] \cdot w_i[t]. \tag{1} $$


3) Sensing Cost: When performing sensing tasks, users
incur extra operational cost (called sensing cost) due to, for
example, energy expenditure and transmission expense.
Let $c_n[t]$ denote the total sensing cost of user $n$ in slot $t$
(including all potential expenses for collecting, processing,
and transmitting the data within $\boldsymbol{z}_n[t]$ to the SP). Obviously,
such a sensing cost is user- and time-dependent.

Due to this direct sensing cost, users may be reluctant to
perform sensing tasks without sufficient incentives. To avoid
this, in each time slot, the SP pays a certain monetary or
non-monetary compensation (which we call the short-term sensing
incentive) to those users who are selected as sensors. Later we
will show that this type of incentive can be easily addressed
through, for example, a first-degree price discrimination [?]
or a truthful auction in each time slot.

4) Participatory Constraint: As discussed in Section I,
users may suffer a certain indirect cost even when not performing
sensing tasks, induced by, for example, reporting location /
mobility information or running sensing apps. Thus, if a user
is rarely selected as a sensor (hence hardly receives the
short-term sensing incentive), he may gradually lose interest in
continuous participation, and decide to no longer participate
in the system (in this case, we say the user drops).

As shown in [24] and [25], such a long-term participation
incentive strongly depends on the user’s Return on Investment
(ROI). In this work, instead of directly estimating the total
return and total investment, we use a simple yet representative
indicator to reflect the user ROI: the Allocation Probability,^4 i.e.,
the probability of each user being selected as a sensor. Namely,
we consider a scenario where each user $n$ will drop out
of the sensing system if his allocation probability (of being
selected as a sensor) is smaller than a specific threshold $D_n$,
called the dropping threshold of user $n$. Therefore, to ensure
the active participation of users, the allocation probability of
each user should be no smaller than his dropping threshold,
which we call the user participatory constraint:

$$ D_n \leq d_n(\boldsymbol{x}_n) \triangleq \frac{1}{T} \sum_{t \in \mathcal{T}} x_n[t], \quad \forall n \in \mathcal{N}, \tag{2} $$

where $d_n(\boldsymbol{x}_n)$ is the time average allocation probability of user
$n$, depending on the allocations of user $n$ in all slots.
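
As a small illustration of the participatory constraint in (2), the following sketch computes the time average allocation probability of each user and compares it with the dropping threshold. The function names, the user labels, and the numbers are hypothetical and chosen only for this example.

```python
from typing import Dict, List


def allocation_probability(x_n: List[int]) -> float:
    """Time average allocation d_n(x_n) = (1/T) * sum_t x_n[t], as in (2)."""
    return sum(x_n) / len(x_n)


def satisfies_participation(alloc: Dict[str, List[int]],
                            thresholds: Dict[str, float]) -> Dict[str, bool]:
    """For each user, check whether D_n <= d_n(x_n) holds."""
    return {n: allocation_probability(x) >= thresholds[n]
            for n, x in alloc.items()}


# Hypothetical allocations over T = 4 slots (1 = selected as sensor).
alloc = {"u1": [1, 0, 1, 0], "u2": [0, 0, 0, 1]}
thresholds = {"u1": 0.5, "u2": 0.5}  # assumed dropping thresholds D_n
result = satisfies_participation(alloc, thresholds)
print(result)
```

In this toy instance the first user is selected in half of the slots and meets his threshold, whereas the second user is selected in a quarter of the slots and would drop.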


B. Service Provider Modeling 


Given the set $\mathcal{N}$ of mobile users participating in the system,
the SP selects a subset of users as sensors in each time
slot. We consider a non-commercial SP (e.g., a non-profit
organization or a governmental department), whose primary
goal is to maximize the total sensing value and minimize the
total sensing cost in the entire time period, subject to the
user participatory constraint in (2).

Given the allocation vector $\boldsymbol{x}[t] = \{x_n[t], \forall n \in \mathcal{N}\}$ in slot
$t$, the total sensing cost (in slot $t$) can be directly defined as
the sum of all selected users’ sensing costs, i.e.,

$$ C[t] \triangleq \sum_{n \in \mathcal{N}} x_n[t] \cdot c_n[t]. \tag{3} $$


The total sensing value (in slot $t$), however, may not be the same
as the aggregate sensing value of all selected users, due to the
overlap of their sensing regions. The key reason is that the
same data collected by multiple users simultaneously can only
generate value once. For convenience, let $y_i[t]$ denote whether
a grid $A_i$ is sensed by at least one mobile user, that is,

$$ y_i[t] \triangleq \Big[ \sum_{n \in \mathcal{N}} x_n[t] \cdot z_{n,i}[t] \Big]^{1}, \tag{4} $$

where $[x]^{1} = 1$ if $x \geq 1$, and $[x]^{1} = x$ if $x < 1$. Then, the
total sensing value (in slot $t$) can be defined as follows:

$$ V[t] \triangleq \sum_{i \in \mathcal{I}} w_i[t] \cdot y_i[t]. \tag{5} $$

Obviously, if the sensing regions of all selected users do not
overlap with each other, then $V[t] = \sum_{n \in \mathcal{N}} v_n[t] \cdot x_n[t]$,
i.e., the total sensing value is directly the sum of all selected
users’ sensing values.
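
The effect of overlapping sensing regions can be seen in a short numerical sketch. It assumes a toy instance with two users and three grids; the helper names `user_value` and `total_value` and all numbers are hypothetical. A grid covered by both users is counted only once in the total sensing value, so the total is smaller than the sum of the individual sensing values.

```python
from typing import List, Sequence


def user_value(z_n: Sequence[int], w: Sequence[float]) -> float:
    """Individual sensing value: sum of the weights of the grids in z_n."""
    return sum(z * w_i for z, w_i in zip(z_n, w))


def total_value(x: Sequence[int], z: List[Sequence[int]],
                w: Sequence[float]) -> float:
    """Total sensing value: each grid counts once if at least one
    selected user covers it (the min(1, .) plays the role of [.]^1)."""
    total = 0.0
    for i, w_i in enumerate(w):
        covered = min(1, sum(x[n] * z[n][i] for n in range(len(x))))
        total += w_i * covered
    return total


w = [3.0, 2.0, 5.0]          # assumed grid weights in one slot
z = [[1, 1, 0], [0, 1, 1]]   # user 0 covers grids 0-1, user 1 covers grids 1-2
x = [1, 1]                   # both users selected

sum_of_individual = sum(x[n] * user_value(z[n], w) for n in range(2))
overall = total_value(x, z, w)
print(sum_of_individual, overall)
```

Here the shared grid (weight 2) is sensed twice, so the aggregate of individual values overstates the total sensing value by exactly that weight.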


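The social welfare generated in each slot $t$ is the difference
between the total sensing value and the total sensing cost, i.e.,

$$ S[t] \triangleq V[t] - C[t]. \tag{6} $$

The overall (average) social welfare in all time slots is

$$ S(\boldsymbol{x}) \triangleq \frac{1}{T} \sum_{t \in \mathcal{T}} S[t] = \frac{1}{T} \sum_{t \in \mathcal{T}} \big( V[t] - C[t] \big), \tag{7} $$

where $\boldsymbol{x} = \{x_n[t], \forall n \in \mathcal{N}, t \in \mathcal{T}\} = \{\boldsymbol{x}[t], \forall t \in \mathcal{T}\}$.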


^4 Consider, for example, a user with an expected direct sensing cost $c_1$,
an expected indirect sensing cost $c_2$, and an expected return $r$ when being
selected as a sensor. Then, an allocation probability $\eta$ directly corresponds to
an expected ROI: $\frac{\eta \cdot r}{\eta \cdot (c_1 + c_2) + (1 - \eta) \cdot c_2}$.
C. Information Scenario

We will study the sensor selection problem in different network
information scenarios. The network information consists
of the weight (data value) of each grid, and the sensing region and
sensing cost of each user in each time slot. Formally, we define
the network information in time slot $t$ as:

$$ \boldsymbol{\theta}[t] \triangleq \{ w_i[t], \boldsymbol{z}_n[t], c_n[t], \ \forall i \in \mathcal{I}, \forall n \in \mathcal{N} \}. \tag{8} $$

Note that the sensing value $v_n[t]$ of each user is not network
information, as it is determined by $w_i[t]$ and $\boldsymbol{z}_n[t]$.

Regarding the future network information, we consider
the scenarios of complete future information, stochastic future
information, and no future information, depending on whether
and how much the SP knows about the future network
information. Regarding the current network information (realization),
we consider the scenarios of information symmetry
and asymmetry, depending on whether the SP can observe the
private information of users (e.g., the sensing cost).

III. Off-line Sensor Selection Benchmark 

In this section, we study the sensor selection problem with 
complete future information and stochastic future information 
(as benchmarks). Note that in these benchmark cases, we 
assume the scenario of information symmetry (regarding the 
current network information), where the SP is able to observe 
all of the network information realized in each time slot. 

A. Complete Future Information 

With complete future information, the SP is able to determine
the sensor selections in all time slots jointly to maximize
the overall social welfare. Thus, the SP’s problem is

$$
\begin{aligned}
\max_{\boldsymbol{x}} \quad & \frac{1}{T} \sum_{t \in \mathcal{T}} \big( V[t] - C[t] \big) \\
\text{s.t.} \quad & \text{(a) } x_n[t] \in \{0,1\}, \ \forall n \in \mathcal{N}, \forall t \in \mathcal{T}, \\
& \text{(b) } D_n \leq d_n(\boldsymbol{x}_n), \ \forall n \in \mathcal{N}.
\end{aligned} \tag{9}
$$

Obviously, (9) is an off-line allocation problem, and the solution
specifies the explicit allocation of each user in each time
slot in advance. Note that (9) is a binary integer program,
and can be effectively solved by many classic methods, such
as the branch-and-bound algorithm in [?].
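
To make the off-line problem with complete future information concrete, the sketch below solves a very small instance by exhaustive enumeration. This is only an illustration under stated assumptions: it replaces branch-and-bound with brute force (feasible only for tiny instances), and the function names `slot_welfare` and `solve_offline` as well as all numbers are hypothetical.

```python
from itertools import product


def slot_welfare(x_t, z_t, w_t, c_t):
    """Social welfare of one slot: total sensing value minus total cost."""
    n_users = len(x_t)
    value = sum(w_i * min(1, sum(x_t[n] * z_t[n][i] for n in range(n_users)))
                for i, w_i in enumerate(w_t))
    cost = sum(x_t[n] * c_t[n] for n in range(n_users))
    return value - cost


def solve_offline(z, w, c, D):
    """Enumerate all binary allocations, keep those satisfying the
    participatory constraint, and return the best average welfare."""
    T, N = len(w), len(D)
    best_welfare, best_x = None, None
    for flat in product((0, 1), repeat=N * T):
        x = [flat[t * N:(t + 1) * N] for t in range(T)]
        if any(sum(x[t][n] for t in range(T)) / T < D[n] for n in range(N)):
            continue  # some user would fall below his dropping threshold
        welfare = sum(slot_welfare(x[t], z[t], w[t], c[t])
                      for t in range(T)) / T
        if best_welfare is None or welfare > best_welfare:
            best_welfare, best_x = welfare, x
    return best_welfare, best_x


# Toy instance: 2 users, 1 grid, 2 slots; both users always cover the grid.
w = [[4.0], [4.0]]
z = [[[1], [1]], [[1], [1]]]
c = [[1.0, 3.0], [1.0, 3.0]]

unconstrained, _ = solve_offline(z, w, c, D=[0.0, 0.0])
constrained, x_star = solve_offline(z, w, c, D=[0.0, 0.5])
print(unconstrained, constrained, x_star)
```

In this instance the cheaper user would be selected in every slot if there were no participatory constraint. Requiring the more expensive user to be selected in at least half of the slots lowers the achievable average welfare, which illustrates how the constraint couples the allocations across time slots.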

It is easy to see that formulating and solving (9) requires the
complete future information, which is obviously impractical.
Hence, we will study another benchmark solution based on the
stochastic information in the next subsection.

B. Stochastic Future Information 

With stochastic information only, the SP cannot decide the 
explicit allocation of each user in each time slot in advance, 
due to the lack of complete future information. Hence, in this 
case, we will focus on the expected social welfare maximiza¬ 
tion based on the stochastic information. 

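Let $x_n(\theta) \in \{0,1\}$ denote whether a user $n$ is selected
as a sensor under a particular information realization $\theta$,
$\boldsymbol{x}(\theta) = \{x_n(\theta), \forall n \in \mathcal{N}\}$ denote the allocation vector of all users
under $\theta$, and $\boldsymbol{x}_n = \{x_n(\theta), \forall \theta \in \Theta\}$ denote the allocation of
user $n$ under all possible $\theta$. Then, the expected social welfare
maximization problem can be defined as follows:^5

$$
\begin{aligned}
\max_{\boldsymbol{x}} \quad & \int_{\theta \in \Theta} \big( V(\theta) - C(\theta) \big) \cdot f(\theta) \, \mathrm{d}\theta \\
\text{s.t.} \quad & \text{(a) } x_n(\theta) \in \{0,1\}, \ \forall n \in \mathcal{N}, \forall \theta \in \Theta, \\
& \text{(b) } D_n \leq d_n(\boldsymbol{x}_n), \ \forall n \in \mathcal{N},
\end{aligned} \tag{10}
$$

where

• $C(\theta) = \sum_{n \in \mathcal{N}} x_n(\theta) \cdot c_n(\theta)$ is the sensing cost under $\theta$;

• $V(\theta) = \sum_{i \in \mathcal{I}} w_i(\theta) \cdot y_i(\theta)$ is the sensing value under $\theta$;

• $y_i(\theta) = \big[ \sum_{n \in \mathcal{N}} x_n(\theta) \cdot z_{n,i}(\theta) \big]^{1}$ indicates whether a grid
$A_i$ is sensed by at least one user under $\theta$;

• $d_n(\boldsymbol{x}_n) = \int_{\theta \in \Theta} x_n(\theta) \cdot f(\theta) \, \mathrm{d}\theta$ is the average allocation
probability of user $n$.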

Similarly, (10) is an off-line problem, and the solution
defines a contingency plan that specifies the allocation of
each user under each possible information realization $\theta$ in
advance. Note that (10) is an integer program with an
infinite number of decision variables (as $\theta$ is continuous),
which is non-convex and NP-hard. Nevertheless, by linear
programming relaxation, we can transform (10) into a
linear programming problem, and solve it by classic methods,
e.g., the KKT analysis.^6

Next we analyze the gap between the maximum social welfare
with complete information (denoted by $S^{\circ}$) derived from
(9) and the maximum expected social welfare with stochastic
information (denoted by $S^{*}$) derived from (10). Formally,

Lemma 1. If $T \to \infty$, then $S^{*} \to S^{\circ}$.

Lemma 1 indicates that as long as the total sensing period
$T$ is large enough, the social welfare loss induced by the loss
of complete network information is negligible. Hence, both
$S^{\circ}$ and $S^{*}$ will serve as the same benchmark for the on-line
policies proposed in Sections IV and V.

It is notable that formulating and solving (10) still requires
certain (stochastic) future information, which may not be
available in practice. This motivates us to further study on-line
policies that do not rely on any future information.

IV. On-line Sensor Selection Policy 

In this section, we study the sensor selection problem with
no future information. We propose an on-line sensor selection
policy based on the Lyapunov optimization framework [30],
which relies only on the current network information and the past
sensor selection history, and not on any future information.
Meanwhile, it asymptotically converges to the benchmark with
complete or stochastic future information proposed in Section
III. Note that we also assume the scenario of information
symmetry in this section, and will further study the scenario
of information asymmetry in the next section.

A. Lyapunov Optimization Technique 

Lyapunov optimization [30] is a widely used technique for
solving stochastic optimization problems with time average
constraints, such as the social welfare maximization problem
(9) in this work (with $T \to \infty$), where the user participatory
constraint (b) is the time average constraint. Hence, we introduce
the Lyapunov optimization technique to solve the sensor
selection problem (9) with no future information.

^5 $\Theta$ is the feasible set of $\theta$, i.e., the set of all possible network information
realizations, and $f(\theta)$ is the probability density function (pdf) of $\theta$.

^6 We leave the details to [31], as the method is standard. Moreover, solving
the stochastic optimization problem is not the main contribution of this work.

1) Queue Definition: The key idea of the Lyapunov optimization
technique is to use the stability of queues to ensure that
the time average constraints are satisfied. Following this idea,
we first introduce a virtual queue ($q_n$) for each user $n$. This
virtual queue is used for buffering the virtual allocation requests
of each user. Here, we use the prefix “virtual” to denote that the
request is not actually initiated by the user, but rather is used
to reflect the requirement of the user participatory constraint.
Namely, one virtual request represents that “to satisfy the user
participatory constraint, the user should be selected as a sensor
in one additional time slot”. Hence, the backlog of a virtual
queue denotes the total number of virtual requests in the queue
(which may not be an integer), i.e., the total number of additional
time slots in which the user should be selected as a sensor (in
order to meet his participatory constraint).

2) Queue Dynamics: With the above queue definition, the
virtual requests of user $n$ enter the queue at a
constant arrival rate of $D_n$. Let $x_n^{\dagger}[t] \in \{0,1\}$ denote
whether user $n$ is selected as a sensor in time slot $t$ (under a certain
sensor selection policy), and $d_n^{\dagger} = d_n(\boldsymbol{x}_n^{\dagger}) = \frac{1}{T} \sum_{t \in \mathcal{T}} x_n^{\dagger}[t]$
denote the average allocation probability of user $n$. Intuitively,
$x_n^{\dagger}[t] = 1$ implies that one virtual request of user $n$ leaves the
queue at slot $t$. Hence, the virtual requests of user $n$ leave
the queue at an average departure rate of $d_n^{\dagger}$.

Let $q_n^{t}$ denote the queue backlog of user $n$ in slot $t$, and
let $\boldsymbol{q}^{t} = \{q_n^{t}, \forall n \in \mathcal{N}\}$ denote the queue backlog vector of all
users. For each user $n$, given the constant arrival of his virtual
requests and the allocation $x_n^{\dagger}[t]$ in each slot $t$ (departure), we
have the following dynamic equation for his virtual queue:

$$ q_n^{t+1} = \big[ q_n^{t} - x_n^{\dagger}[t] \big]^{+} + D_n, \tag{11} $$

where $[x]^{+} = \max(x, 0)$.

Next, we show how to connect the queue stability condition
with the user participatory constraint in our problem. We say
a virtual queue is rate stable if

$$ \lim_{t \to \infty} \frac{q_n^{t}}{t} = 0 \quad \text{with probability 1.} $$

By the queue stability theorem [30], a queue $q_n^{t}$ is rate stable if
and only if the arrival rate is no larger than the departure rate,
i.e., $D_n \leq d_n^{\dagger}$. This establishes the equivalence between the
queue stability condition and the user participatory constraint.
That is, to guarantee the user participatory constraint in our
problem, we only need to ensure that the associated virtual
queue is rate stable under the proposed policy.
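
The link between rate stability and the participatory constraint can be checked numerically with a short simulation of the virtual queue recursion in (11). The sketch below is illustrative only: the function name `simulate_queue`, the periodic selection patterns, and the threshold value are assumptions made for this example.

```python
def simulate_queue(D_n, selections, q0=0.0):
    """Run the virtual queue recursion q <- max(q - x, 0) + D_n
    over a sequence of selections x in {0, 1} and return the final backlog."""
    q = q0
    for x in selections:
        q = max(q - x, 0.0) + D_n
    return q


T = 10_000
D_n = 0.3  # assumed dropping threshold

# User selected in every second slot: allocation frequency 0.5 >= D_n.
often = [1 if t % 2 == 0 else 0 for t in range(T)]
# User selected in every tenth slot: allocation frequency 0.1 < D_n.
rarely = [1 if t % 10 == 0 else 0 for t in range(T)]

ratio_often = simulate_queue(D_n, often) / T
ratio_rarely = simulate_queue(D_n, rarely) / T
print(ratio_often, ratio_rarely)
```

When the user is selected often enough, the backlog stays bounded and the ratio of backlog to elapsed slots tends to zero. When the user is selected too rarely, the backlog grows roughly linearly, at a rate close to the shortfall between the threshold and the actual allocation frequency.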

3) Queue Stability: Now we study the queue stability using
the Lyapunov drift. We first define the Lyapunov function:

$$ J[t] \triangleq \frac{1}{2} \sum_{n \in \mathcal{N}} \big( q_n^{t} \big)^2. \tag{12} $$

The Lyapunov drift in each slot $t$ is defined as the change of
the Lyapunov function from one slot to the next, i.e.,

$$ \Delta[t] \triangleq J[t+1] - J[t]. \tag{13} $$


By the Lyapunov drift theorem (Th. 4.1 in [30]), if a policy
greedily minimizes the Lyapunov drift $\Delta[t]$ in each slot $t$, then
all backlogs are consistently pushed towards a low level, which
potentially maintains the stability of all queues (i.e., ensures
the participatory constraints of all users).

4) Joint Queue Stability and Welfare Maximization: Next,
we analyze the joint queue stability and objective optimization
(i.e., expected social welfare maximization). By the Lyapunov
optimization theorem (Th. 4.2 in [30]), to stabilize the queues
while optimizing the objective, we can use an allocation
policy that greedily minimizes the following drift-plus-penalty:

$$ U[t] \triangleq \Delta[t] - \phi \cdot \big( V[t] - C[t] \big), \tag{14} $$

where the (negative) social welfare, i.e., $C[t] - V[t]$, is viewed
as the penalty incurred in each slot $t$; $\phi > 0$ is a positive
control parameter that is chosen to achieve a desirable tradeoff
between the optimality and the queue backlog.

We further notice that directly minimizing the
drift-plus-penalty defined in (14) may be difficult (partly because $\Delta[t]$ is
a quadratic function). Hence, we will focus on minimizing a
specific upper-bound of the drift-plus-penalty to achieve the
joint stability and optimization.

Next, we give such an upper-bound. Notice that

$$
\begin{aligned}
\Delta[t] & \leq \frac{1}{2} \sum_{n \in \mathcal{N}} \Big( x_n^{\dagger}[t]^2 + D_n^2 + 2 \cdot q_n^{t} \cdot \big( D_n - x_n^{\dagger}[t] \big) \Big) \\
& \leq B + \sum_{n \in \mathcal{N}} q_n^{t} \cdot \big( D_n - x_n^{\dagger}[t] \big),
\end{aligned} \tag{15}
$$

where $B = \frac{1}{2} \sum_{n \in \mathcal{N}} (1 + D_n^2)$ is a constant.^7 Then, we have the
following upper-bound for the drift-plus-penalty in (14):

$$ U[t] \leq B + \sum_{n \in \mathcal{N}} q_n^{t} \cdot \big( D_n - x_n^{\dagger}[t] \big) - \phi \cdot \big( V[t] - C[t] \big). \tag{16} $$

By the Lyapunov optimization theory, it is easy to show that
minimizing the above upper-bound of the drift-plus-penalty is
equivalent to minimizing the drift-plus-penalty itself, in terms
of the queue stability and objective optimization.
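
The drift upper-bound can be sanity-checked numerically. The sketch below draws random backlogs, selections, and thresholds, applies the queue update in (11), and compares the resulting change of the Lyapunov function with the bound. It assumes the quadratic Lyapunov function with factor 1/2 and takes the constant as $B = \frac{1}{2} \sum_n (1 + D_n^2)$; both are assumptions of this illustration, chosen to be consistent with the footnoted inequality, and the helper names are hypothetical.

```python
import random


def lyapunov(q):
    """Quadratic Lyapunov function J = (1/2) * sum_n q_n^2 (assumed form)."""
    return 0.5 * sum(v * v for v in q)


def drift_and_bound(q, x, D):
    """Return the exact one-slot drift and the upper-bound
    B + sum_n q_n * (D_n - x_n)."""
    q_next = [max(qn - xn, 0.0) + dn for qn, xn, dn in zip(q, x, D)]
    drift = lyapunov(q_next) - lyapunov(q)
    B = 0.5 * sum(1.0 + dn * dn for dn in D)
    bound = B + sum(qn * (dn - xn) for qn, xn, dn in zip(q, x, D))
    return drift, bound


random.seed(0)
violations = 0
for _ in range(1000):
    n_users = 5
    q = [random.uniform(0.0, 10.0) for _ in range(n_users)]
    x = [random.randint(0, 1) for _ in range(n_users)]
    D = [random.uniform(0.0, 1.0) for _ in range(n_users)]
    drift, bound = drift_and_bound(q, x, D)
    if drift > bound + 1e-9:  # small tolerance for floating-point error
        violations += 1
print(violations)
```

The bound is linear in the selection variables, which is what makes the per-slot minimization tractable compared with the quadratic drift itself.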

Remark. Beyond following the standard Lyapunov optimization
framework [30], our own contributions in this part
are two-fold. First, we explicitly define the virtual queue,
and analytically connect the user participatory constraint and
the queue stability. This is the basis of applying Lyapunov
optimization in our problem. Second, we propose an upper-bound
(16) for the drift-plus-penalty, which is problem-specific
and does not have a generic form suitable for all problems. The
on-line policy presented later is based on this upper-bound.

B. On-line Allocation Policy 

Based on the above theoretical analysis, we now design an
on-line policy that aims at minimizing the drift-plus-penalty
upper-bound in (16) in each time slot. We present such a
Lyapunov optimization based policy in Policy 1.


^7 The first inequality follows because $([q - x]^{+} + D)^2 \leq q^2 + x^2 + D^2 +
2q \cdot (D - x)$. The second inequality follows because $x_n^{\dagger}[t] \leq 1$.
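Policy 1: Lyapunov-based Policy (Information Symmetry)

Initialization: $\boldsymbol{q}^{0}$;
for each time slot $t = 0, 1, \dots, T$ do
    Allocation Rule:
        $\boldsymbol{x}^{\dagger}[t] = \arg\max_{\boldsymbol{x}[t]} \Big( V[t] - C[t] + \sum_{n \in \mathcal{N}} \frac{q_n^{t}}{\phi} \cdot x_n[t] \Big)$
    Updating Rule:
        $q_n^{t+1} = \big[ q_n^{t} - x_n^{\dagger}[t] \big]^{+} + D_n, \ \forall n \in \mathcal{N}$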


1) Algorithm Design: The proposed Policy 1 consists of
an allocation rule and an updating rule in each time slot.
The allocation rule determines the sensor selection (allocation)
$\boldsymbol{x}^{\dagger}[t]$ in each slot $t$, based on the current network information
$\boldsymbol{\theta}[t]$ and the current queue backlogs $\boldsymbol{q}^{t}$, aiming at minimizing
the upper-bound of the drift-plus-penalty in (16). The updating
rule updates the queue backlogs based on the current allocation
result $\boldsymbol{x}^{\dagger}[t]$ according to (11). It is easy to see that Policy 1
relies only on the current network information and the past
sensor selection history (captured by the queue backlogs),
and not on any complete or stochastic future information.
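
The two rules of the Lyapunov-based policy can be sketched in a few lines for a tiny instance. The sketch makes several assumptions that are not part of the original policy description: the per-slot maximization is done by exhaustive enumeration (feasible only for a handful of users), the network information is kept constant across slots for simplicity, the initial backlogs are set to zero, and the names `slot_welfare` and `policy1_step` and all numbers are hypothetical.

```python
from itertools import product


def slot_welfare(x, z, w, c):
    """Social welfare of one slot: total sensing value minus total cost."""
    n_users = len(x)
    value = sum(w_i * min(1, sum(x[n] * z[n][i] for n in range(n_users)))
                for i, w_i in enumerate(w))
    cost = sum(x[n] * c[n] for n in range(n_users))
    return value - cost


def policy1_step(q, z, w, c, D, phi):
    """One slot of the policy: pick the allocation maximizing
    welfare + sum_n (q_n / phi) * x_n, then update the virtual queues."""
    n_users = len(q)
    best_x, best_score = None, None
    for x in product((0, 1), repeat=n_users):
        score = slot_welfare(x, z, w, c) + sum(
            q[n] / phi * x[n] for n in range(n_users))
        if best_score is None or score > best_score:
            best_x, best_score = x, score
    q_next = [max(q[n] - best_x[n], 0.0) + D[n] for n in range(n_users)]
    return best_x, q_next


# Toy instance: 2 users covering the same single grid in every slot.
z, w, c = [[1], [1]], [4.0], [1.0, 3.0]
D, phi, T = [0.0, 0.5], 2.0, 1000

q, counts = [0.0, 0.0], [0, 0]
for _ in range(T):
    x, q = policy1_step(q, z, w, c, D, phi)
    counts = [counts[n] + x[n] for n in range(2)]
freq = [k / T for k in counts]
print(freq, q)
```

In this instance the more expensive user would never be selected by a purely welfare-maximizing rule. Under the policy, his backlog grows until the bonus term outweighs his cost disadvantage, after which he is selected in roughly every second slot, so that his empirical allocation frequency approaches his threshold while the backlog stays bounded.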

2) Optimality: Now we establish the optimality of Policy 1.
Let $S^{\dagger}[t]$ denote the social welfare generated in each slot $t$,
and $S^{*}$ denote the maximum social welfare benchmark with
the stochastic information (derived in Section III-B). Formally,

Theorem 1 (Optimality).

$$ \lim_{T \to \infty} \frac{1}{T} \sum_{t \in \mathcal{T}} \mathbb{E}\big( S^{\dagger}[t] \big) \geq S^{*} - O(1/\phi). $$

The proof follows standard Lyapunov optimization theory
[30]. By Theorem 1, we can easily find that Policy 1 converges
to the maximum social welfare benchmark asymptotically, with
a controllable approximation error bound $O(1/\phi)$.

Intuitively, in Policy 1, each virtual queue can be viewed as
a regulation factor for lowering (regulating) the sensing cost
of that user, and hence increasing the selection probability of
that user. By the updating rule in Policy 1, we can further
obtain the following approximation for the queue backlog:^8

$$ q_n^{t} \approx t \cdot D_n - \sum_{k=0}^{t-1} x_n^{\dagger}[k]. $$

This implies that the time-attenuated queue backlog $q_n^{t}/t$ can be
used to approximate the gap between the required allocation
probability (i.e., $D_n$) and the actual allocation probability till
slot $t$ (i.e., $\frac{1}{t} \sum_{k=0}^{t-1} x_n^{\dagger}[k]$). Notice that the queue backlog
is bounded, hence the above gap goes to zero as $t \to \infty$.


V. Auction-based On-line Sensor Selection Policy

In this section, we consider the asymmetric information
scenario (regarding the current information), where the sensing
cost of each user $n$ realized in each time slot $t$ (i.e., $c_n[t]$) is

^8 This approximation is obtained by simply omitting the operation $[\cdot]^{+}$.


Policy 2: Auction-based Policy (Information Asymmetry) 

Initialization: fi — \ 

for each time slot t = 0,1,..., T do 

Denote c^[t] as the bid of eaeh user n; 

Allocation Rule: 

x^[t] = argmax V[t] - Y Xn[t] ■ (4[t] - fZ) 

neN 

Payment Rule: 

Pn[t] = xi[t] ■ (vi[t] - Cijt] - St„[t] + ni) 
Updating Rule: 

= I ■ ■ i^*n - xiir +. vn € A/" 


his private information, and cannot be observed by the SP. 
Obviously, without this private sensing cost, the SP cannot 
implement the allocation rule in Policy 


A. Auction Mechanism Design 

We design a (reverse) VCG auction to elicit credible information disclosure from users in each time slot, where the SP is the auctioneer (buyer) and users are the bidders (sellers). A standard VCG auction usually consists of an allocation rule (winner determination) and a payment rule. Our proposed auction mechanism involves a set of regulation factors (which are introduced to ensure the user participatory constraint), and hence includes an additional updating rule for the regulation factors. We present the detailed auction mechanism in Policy 2. Next we explain these rules in detail.

For convenience, we denote $c'_n[t]$ as each user $n$'s bid (report) regarding his sensing cost in each slot $t$, and $\mu_n^t$ as the regulation factor (similar to the virtual queue in Section IV) associated with each user $n$ in each slot $t$.

1) Allocation Rule: The allocation rule aims at maximizing a regulated social welfare in each time slot:

$$\tilde{S}[t] = V[t] - \sum_{n \in \mathcal{N}} x_n[t] \cdot \tilde{c}_n[t],$$

where $\tilde{c}_n[t] = c'_n[t] - \mu_n^t$ is the regulated sensing cost of user $n$, depending on both the user bid and the regulation factor. For convenience, we denote $x^\dagger[t] = \{x_n^\dagger[t], \forall n \in \mathcal{N}\}$ as the allocation result in slot $t$ (i.e., the allocation that maximizes $\tilde{S}[t]$).

2) Payment Rule: The payment to user $n$ in each time slot $t$ is: (i) $p_n[t] = 0$ if user $n$ is not selected, i.e., $x_n^\dagger[t] = 0$, or (ii) if user $n$ is selected, i.e., $x_n^\dagger[t] = 1$, then

$$p_n[t] = V^\dagger[t] - C_{-n}^\dagger[t] - S_{-n}^\dagger[t] + \mu_n^t, \qquad (17)$$

where $V^\dagger[t]$ is the total sensing value under $x^\dagger[t]$, $C_{-n}^\dagger[t] = \sum_{i \in \mathcal{N}, i \neq n} x_i^\dagger[t] \cdot \tilde{c}_i[t]$ is the total (regulated) sensing cost except that of user $n$ under $x^\dagger[t]$, and $S_{-n}^\dagger[t]$ is the maximum achievable social welfare when excluding user $n$ from the system. The first three terms correspond to the payment in a standard VCG auction. The last term compensates the user for the cost regulation.
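The payment rule can be made concrete with a small Python sketch. It is an illustration under stated assumptions rather than the authors' implementation: the sensing value is assumed to be coverage-based (total value of grids covered by at least one selected user), the costs of the other users are taken as their regulated costs (bid minus regulation factor), and all maximizations are done by brute force. The names `best_allocation` and `vcg_payments` are hypothetical.

```python
from itertools import product


def total_value(x, coverage, grid_value):
    """Sensing value V: total value of grids covered by at least one selected user."""
    covered = set()
    for selected, grids in zip(x, coverage):
        if selected:
            covered |= grids
    return sum(grid_value[g] for g in covered)


def best_allocation(bids, mu, coverage, grid_value, excluded=None):
    """Maximize the regulated welfare V - sum_n x_n * (bid_n - mu_n).

    If `excluded` is given, that user is forced to x_n = 0.
    Returns (allocation, regulated welfare).
    """
    n = len(bids)
    best_x, best_s = None, float("-inf")
    for x in product((0, 1), repeat=n):
        if excluded is not None and x[excluded]:
            continue
        cost = sum(x[i] * (bids[i] - mu[i]) for i in range(n))
        s = total_value(x, coverage, grid_value) - cost
        if s > best_s:
            best_x, best_s = x, s
    return best_x, best_s


def vcg_payments(bids, mu, coverage, grid_value):
    """Payment per user: V - C_{-n} - S_{-n} + mu_n if selected, else 0."""
    x, _ = best_allocation(bids, mu, coverage, grid_value)
    v = total_value(x, coverage, grid_value)
    payments = []
    for n in range(len(bids)):
        if not x[n]:
            payments.append(0.0)
            continue
        cost_others = sum(
            x[i] * (bids[i] - mu[i]) for i in range(len(bids)) if i != n
        )
        _, s_without_n = best_allocation(bids, mu, coverage, grid_value, excluded=n)
        payments.append(v - cost_others - s_without_n + mu[n])
    return x, payments
```

For two users covering grids {0, 1} and {1, 2} (unit grid values), bids 1.5 and 0.8, and zero regulation factors, the second user wins and is paid 1.5, which is the largest bid with which that user would still win.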
3) Updating Rule: Inspired by Policy 1, we have:

$$\mu_n^{t+1} = \frac{1}{\phi} \cdot \left( \left[ \phi \cdot \mu_n^t - x_n^\dagger[t] \right]^+ + D_n \right), \quad \forall n \in \mathcal{N}.$$

The above updating rule is exactly the same as that in Policy 1, by simply viewing $\phi \cdot \mu_n^t$ as the queue backlog of user $n$. Obviously, if users are truthful, then the above allocation and updating rules achieve exactly the same allocation and performance as in Policy 1.
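The equivalence between the two updating rules can be checked numerically. The sketch below assumes the queue update $q \leftarrow [q - x]^+ + D$ for the Lyapunov-based policy and the regulation-factor update stated above, both started from zero; the function names are hypothetical.

```python
def update_queue(q, x, d):
    """Queue update of the Lyapunov-based policy: q <- [q - x]^+ + d."""
    return max(q - x, 0.0) + d


def update_regulation_factor(mu, x, d, phi):
    """Regulation-factor update of the auction: mu <- ([phi * mu - x]^+ + d) / phi."""
    return (max(phi * mu - x, 0.0) + d) / phi


def max_mismatch(selections, d, phi):
    """Largest |phi * mu - q| observed when both updates see the same selections."""
    q, mu, worst = 0.0, 0.0, 0.0
    for x in selections:
        q = update_queue(q, x, d)
        mu = update_regulation_factor(mu, x, d, phi)
        worst = max(worst, abs(phi * mu - q))
    return worst
```

Up to floating-point rounding, the mismatch stays at zero for any selection sequence, which is the sense in which $\phi \cdot \mu_n^t$ plays the role of the queue backlog.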

B. Truthfulness and Optimality 

Theorem 2 (Truthfulness). The auction in Policy 2 is truthful.

Proof: Due to the space limit, we only show that each user $n$ has no incentive to report (bid) a cost higher than his true cost.⁹ There are four possible outcomes:

(a) {loss, loss}: user $n$ loses when bidding both truthfully and non-truthfully. He receives a zero payment under both strategies.

(b) {win, loss}: user $n$ wins (loses) when bidding truthfully (non-truthfully). He receives a smaller payment (i.e., zero) when bidding non-truthfully.

(c) {loss, win}: user $n$ loses (wins) when bidding truthfully (non-truthfully). This is impossible, as a user who loses with a lower bid will never win when submitting a higher bid.

(d) {win, win}: user $n$ wins when bidding both truthfully and non-truthfully. We show that user $n$ receives the same payment under both strategies. First, the third term and the last term in (17) are obviously identical under both strategies. Second, the first two terms in (17) are also identical, due to the following assertion: if an allocation vector $x^*$ maximizes the social welfare, then, excluding any user $n$ and removing the grids sensed by user $n$ (under $x^*$), the remaining vector $x^*_{-n}$ maximizes the social welfare in the remaining system. ■
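The truthfulness property for myopic users can also be probed numerically on a small instance. The following Python sketch is a sanity check, not a proof: it assumes a coverage-based sensing value, regulated costs for the other users in the payment, brute-force maximization, and truthful bids from all other users. The names `utility` and `best_misreport_gain` are hypothetical.

```python
from itertools import product


def total_value(x, coverage, grid_value):
    """Sensing value V: total value of grids covered by at least one selected user."""
    covered = set()
    for selected, grids in zip(x, coverage):
        if selected:
            covered |= grids
    return sum(grid_value[g] for g in covered)


def best_allocation(bids, mu, coverage, grid_value, excluded=None):
    """Maximize V - sum_n x_n * (bid_n - mu_n); optionally force one user out."""
    n = len(bids)
    best_x, best_s = None, float("-inf")
    for x in product((0, 1), repeat=n):
        if excluded is not None and x[excluded]:
            continue
        cost = sum(x[i] * (bids[i] - mu[i]) for i in range(n))
        s = total_value(x, coverage, grid_value) - cost
        if s > best_s:
            best_x, best_s = x, s
    return best_x, best_s


def utility(n, bid, true_costs, mu, coverage, grid_value):
    """Payment minus true cost for user n bidding `bid`; others bid truthfully."""
    bids = list(true_costs)
    bids[n] = bid
    x, _ = best_allocation(bids, mu, coverage, grid_value)
    if not x[n]:
        return 0.0
    v = total_value(x, coverage, grid_value)
    cost_others = sum(x[i] * (bids[i] - mu[i]) for i in range(len(bids)) if i != n)
    _, s_without_n = best_allocation(bids, mu, coverage, grid_value, excluded=n)
    payment = v - cost_others - s_without_n + mu[n]
    return payment - true_costs[n]


def best_misreport_gain(n, true_costs, mu, coverage, grid_value, candidate_bids):
    """Largest utility gain over truthful bidding among the candidate bids."""
    truthful = utility(n, true_costs[n], true_costs, mu, coverage, grid_value)
    return max(
        utility(n, b, true_costs, mu, coverage, grid_value) - truthful
        for b in candidate_bids
    )
```

A non-positive gain for every user and every candidate bid is what truthfulness predicts within a single slot; the check says nothing about non-myopic users who anticipate the effect of a bid on future regulation factors.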

Theorem 3 (Optimality). The auction in Policy 2 achieves the same asymptotically optimal social welfare as Policy 1.

Proof: By the truthfulness given in Theorem 2, together with the observation that the allocation and updating rules in Policy 2 are exactly the same as those in Policy 1, the optimality follows immediately. ■

Remark. Policy 2 is truthful only when users are myopic, in the sense that they only care about the current benefit in each time slot, and do not anticipate the potential impact of their bidding strategies on future benefits. As a counter-example, a non-myopic user may report a large fake cost. In response, the SP will assign a large regulation factor to the user (in order to satisfy the user's participatory constraint), which potentially increases the user's future payment. We will study the model with non-myopic users in our future work.

⁹ The proof for "users are not willing to report costs lower than the true values" is similar. Please refer to [33] for details.


VI. Simulations

In our simulations, we launch a participatory sensing application in a medium-scale virtual city of size 10 km × 10 km. The whole area is divided into 2500 grids, each corresponding to a square of 200 m × 200 m. Users move according to the random walk model: in each time slot, each user jumps from the original location (grid) to another location (grid) randomly, according to a certain probability distribution. For illustrative purposes, we assume that the sensing region of each user in each time slot is a disk, centered at his location, with a radius randomly picked from [400 m, 800 m].¹⁰ We run the system for a period of 10,000 time slots, which is long enough to obtain stable outcomes under our adopted policies.

Fig. 2. Illustration of Two Scenarios: (a) no hotspot and (b) one hotspot.
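The simulation geometry can be sketched in a few lines of Python. The grid dimensions and the radius range come from the text; everything else is an assumption made for illustration: a grid counts as sensed when its centre lies inside the disk, the user sits at the centre of his grid, the radius is drawn uniformly, and the random walk jumps uniformly to one of the four neighbouring grids or stays put. The function names are hypothetical.

```python
import random

GRID_SIZE_M = 200      # side length of one grid, from the text
GRIDS_PER_SIDE = 50    # 10 km / 200 m, i.e. 2500 grids in total


def covered_grids(x_m, y_m, radius_m):
    """Grids whose centre lies within `radius_m` of the position (x_m, y_m)."""
    lo_i = max(0, int((x_m - radius_m) // GRID_SIZE_M))
    hi_i = min(GRIDS_PER_SIDE - 1, int((x_m + radius_m) // GRID_SIZE_M))
    lo_j = max(0, int((y_m - radius_m) // GRID_SIZE_M))
    hi_j = min(GRIDS_PER_SIDE - 1, int((y_m + radius_m) // GRID_SIZE_M))
    covered = set()
    for i in range(lo_i, hi_i + 1):
        for j in range(lo_j, hi_j + 1):
            cx = (i + 0.5) * GRID_SIZE_M
            cy = (j + 0.5) * GRID_SIZE_M
            if (cx - x_m) ** 2 + (cy - y_m) ** 2 <= radius_m ** 2:
                covered.add((i, j))
    return covered


def sample_sensing_region(i, j, rng):
    """Disk centred at the centre of grid (i, j), radius uniform in [400 m, 800 m]."""
    radius = rng.uniform(400.0, 800.0)
    return covered_grids((i + 0.5) * GRID_SIZE_M, (j + 0.5) * GRID_SIZE_M, radius)


def random_walk_step(i, j, rng):
    """Stay or jump to one of the four neighbouring grids, clipped to the area."""
    di, dj = rng.choice([(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)])
    ni = min(max(i + di, 0), GRIDS_PER_SIDE - 1)
    nj = min(max(j + dj, 0), GRIDS_PER_SIDE - 1)
    return ni, nj
```

Under these assumptions a user in the middle of the area senses between 13 grids (radius 400 m) and 49 grids (radius 800 m) per slot, which gives a feel for how strongly neighbouring users overlap.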

A. Simulation Scenarios 

We consider two different simulation scenarios, (a) and (b), which differ in the distribution of data values across areas, as shown in Fig. 2. In scenario (a), there is no hotspot, and all grids are of similar importance. Hence, the data values in different areas follow an i.i.d. distribution. In scenario (b), there is one hotspot, and the grids near the centre of the hotspot are more important than those far from it, and hence have larger data values. Note that any scenario with multiple hotspots can be viewed as an intermediate case between (a) and (b). For a fair comparison, we set the average data value over the whole area to 0.5 in both scenarios.
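The two value maps can be generated as follows. This is a hedged sketch: the text fixes only the average value (0.5) and the qualitative shape, so the uniform distribution in scenario (a), the exponential decay around the hotspot in scenario (b), the hotspot location, and the decay length are all assumptions, and the function names are hypothetical.

```python
import math
import random

GRIDS_PER_SIDE = 50  # 50 x 50 = 2500 grids, as in the simulation setup


def rescale_to_mean(values, target_mean=0.5):
    """Scale all grid values so that their average equals `target_mean`."""
    scale = target_mean * len(values) / sum(values.values())
    return {g: v * scale for g, v in values.items()}


def scenario_a_values(rng):
    """No hotspot: i.i.d. grid values (uniform on [0, 1) is an assumption)."""
    raw = {
        (i, j): rng.random()
        for i in range(GRIDS_PER_SIDE)
        for j in range(GRIDS_PER_SIDE)
    }
    return rescale_to_mean(raw)


def scenario_b_values(hotspot=(25, 25), decay_grids=10.0):
    """One hotspot: value decays with distance from the hotspot centre.

    The exponential shape and the decay length are illustrative assumptions.
    """
    raw = {
        (i, j): math.exp(-math.hypot(i - hotspot[0], j - hotspot[1]) / decay_grids)
        for i in range(GRIDS_PER_SIDE)
        for j in range(GRIDS_PER_SIDE)
    }
    return rescale_to_mean(raw)
```

Rescaling both maps to the same mean mirrors the fair-comparison condition in the text: the two scenarios then differ only in how the value is spread over the area.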

B. Performance Comparisons 

We now compare the performance of our proposed policy with the RADP-VPC policy proposed in [24] and [25], a well-known policy that considers the participation incentive.¹¹ To draw a more convincing conclusion, we also compare our policy with those that do not consider the participation incentive, e.g., random selection and greedy selection (both are widely used in practical applications such as Waze [10]).¹²

1) Dropping Probability: We first compare the user dropping probability under different policies. In this simulation, we set the dropping threshold to 0.5 for all users. Namely, if the allocation probability of a user is less than 0.5, the user will drop out of the system.¹³ Fig. 3 illustrates the dynamics of user allocation probabilities as well as the dropping of users.
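The dropping rule can be sketched as below. The threshold (0.5) and the 40-slot warm-up, during which every user is selected, come from the text and its footnote; measuring the allocation probability as the running fraction of slots in which the user has been selected, and checking it in every slot after the warm-up, are assumptions. The function name is hypothetical.

```python
def first_drop_slot(selections, threshold=0.5, warmup=40):
    """Return the slot index at which a user drops, or None if the user stays.

    `selections` lists the 0/1 selection outcomes for the slots after the
    warm-up; the user is assumed to be selected in each of the warm-up slots.
    """
    selected_count = warmup
    for t, x in enumerate(selections, start=warmup):
        selected_count += x
        allocation_probability = selected_count / (t + 1)
        if allocation_probability < threshold:
            return t
    return None
```

Under this reading, a user who is never selected after the warm-up drops at slot 80, when the running fraction first falls below 0.5, while a user selected in every other slot never drops.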

¹⁰ We will consider more practical mobility models and sensing region scenarios based on real data traces in our future work.

¹¹ In the RADP-VPC policy, each user $n$'s cost is regulated by a virtual credit $v_n$, and the virtual credit $v_n$ is updated in the following way: (i) $v_n = v_n + \alpha$ if user $n$ is not selected as a sensor in the previous slot, and (ii) $v_n = 0$ if user $n$ is selected as a sensor in the previous slot, where $\alpha > 0$ is a controllable parameter. Intuitively, a larger $\alpha$ can better satisfy the user participatory constraint, but may reduce the generated social welfare.

¹² In the greedy (random) policy, users are selected one by one in a descending (random) order of their generated social welfares.

¹³ To reduce the "start effect", where a user may mistakenly drop in the first few slots (due to the low allocation probability in these slots), we assume that all users will be selected as sensors in the first 40 time slots.
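The virtual credit update described for the RADP-VPC baseline is simple enough to state as code. The sketch follows only the footnote's description (add $\alpha$ after a slot without selection, reset to zero after a selection); the zero initial credit and the function names are assumptions, and how the credit enters the cost regulation is left to [24] and [25].

```python
def update_virtual_credit(credit, selected, alpha):
    """Reset the credit after a selection, otherwise increase it by alpha."""
    if selected:
        return 0.0
    return credit + alpha


def credit_trajectory(selections, alpha):
    """Credits after each slot, starting from an assumed initial credit of zero."""
    credit, history = 0.0, []
    for selected in selections:
        credit = update_virtual_credit(credit, selected, alpha)
        history.append(credit)
    return history
```

Unlike the queue-based regulation factor, this credit forgets the whole selection history after a single selection, which is one way to read why a fixed $\alpha$ may not track a required allocation probability.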
Fig. 3. Allocation Probability Dynamics and User Dropping in Scenario (a) (the first row) and Scenario (b) (the second row). The dropping of a user is illustrated by the sudden decrease of his allocation probability. The percentage of dropping users is denoted by the blue shadow area.


Fig. 4. Average Social Welfare under Different Policies in Scenario (a) (left) and Scenario (b) (right). Policy: (3a)-(3c) Lyapunov-based policy ($\phi = \{20, 10, 5\}$) proposed in this work; (4a)-(4c) RADP-VPC policy ($\alpha = \{1, 0.5, 0.2\}$) proposed in [24] and [25]; (5) Random policy; (6) Greedy policy.


We can see that, in scenario (a) (the first row), more than 70% of users drop under the greedy or random sensor selection policy, and around 25% of users drop under the RADP-VPC policy ($\alpha = 1$); in scenario (b) (the second row), more than 90% of users drop under the greedy or random sensor selection policy, and more than 50% of users drop under the RADP-VPC policy ($\alpha = 1$). Our proposed policy, however, retains all users in both scenarios.


We can further see that, under the same policy (except our
proposed one), more users drop in scenario (b) than in
scenario (a). The reason is as follows. In scenario (b) with one 
hotspot, most of the data value is concentrated in the hotspot 
area, and hence a large total sensing value can potentially be 
collected by a small number of users (located in the hotspot 
area). In scenario (a) with no hotspot, however, the data value 
is uniformly distributed in all areas, and hence a large total 
sensing value can be collected only by a large enough number 
of users (distributed in the whole area). Hence, to achieve the 
same level of sensing value, the number of sensors needed in 
scenario (b), on average, is smaller than that needed in scenario 
(a). Accordingly, the user allocation probability is lower, and 
hence more users drop, in scenario (b). 


2) Social Welfare: We then compare the average social welfare under different policies in Fig. 4. Curve (1) is the maximum social welfare with no participatory constraint, and serves as an upper bound of the maximum achievable social welfare with the participatory constraint. Curve (2) is the maximum social welfare benchmark (with the participatory constraint) with complete or stochastic future information, derived earlier in the paper. The gap between curves (1) and (2) is called the incentive cost, which is the welfare given up to guarantee the users' long-term participation incentive. In our simulations, the incentive cost is approximately 6% in scenario (a) and 8% in scenario (b). That is, the incentive cost is higher in scenario (b) than in (a), due to the higher dropping probability.

Curves (3a)-(3c) denote the social welfare achieved by our proposed Lyapunov-based Policy 1 or 2 (with $\phi$ = 20, 10, and 5, respectively). Our policy converges to the optimal benchmark asymptotically, with very small approximation errors, e.g., {1%, 2%, 3%} in scenario (a) and {1.5%, 3%, 4.5%} in scenario (b). Note that the approximation error bound is controllable by choosing different values of $\phi$. We can further see that the benchmark (i.e., the maximum social welfare) is higher in scenario (b), as the same amount of sensing value can potentially be collected by fewer users in scenario (b) than in scenario (a). Accordingly, our proposed policy can achieve a higher social welfare in scenario (b).

Curves (4a)-(4c) denote the social welfare achieved by the RADP-VPC policy (with $\alpha$ = 1, 0.5, and 0.2, respectively) proposed in [24] and [25]. Obviously, the performance of RADP-VPC largely depends on the choice of parameter $\alpha$. In scenario (a), the social welfare gap between the RADP-VPC policy and our policy ranges from 15% (when $\alpha = 1$) to 50% (when $\alpha = 0.2$). In scenario (b), this gap increases to 40% and 75%, respectively. In fact, unlike our policy or the benchmark, the RADP-VPC policy performs worse in scenario (b), due to the higher dropping probability in scenario (b). This illustrates the importance of considering the long-term participatory incentive in a sensing system.
Fig. 5. Allocation Probability vs User Number



Fig. 6. Social Welfare vs User Number 



Fig. 7. Social Welfare vs Dropping Threshold 


Finally, Curves (5) and (6) denote the social welfare achieved by the random and greedy sensor selection policies. Neither policy considers the long-term participation incentive, hence users drop quickly (see Fig. 3) and the social welfare decreases dramatically. The social welfare gap between these two policies and our policy is larger than 60% in scenario (a) and 85% in scenario (b). Similarly to the RADP-VPC policy, these two policies both perform worse in scenario (b) than in (a), due to the higher dropping probability in scenario (b). Counter-intuitively, the greedy policy performs worse than the random policy, due to the higher user dropping probability under the greedy policy. This also illustrates the importance of considering the long-term participatory incentive in a sensing system.

C. Impact of Participatory Constraint 

So far, we have shown in Fig. 4 that our policy converges to the maximum social welfare benchmark asymptotically. Next, we show in Figs. 5-7 how the participatory constraint affects this benchmark. We provide the results for scenario (a) only, as those for scenario (b) are similar.

Fig. 5 illustrates the user allocation probability vs the number of participating users, under different sensing costs (where C/V denotes the average ratio of unit sensing cost to unit data value). Obviously, the allocation probability decreases with both the sensing cost and the number of users (due to the partial conflict of their sensing activities). Note that in this result, there is no participatory constraint. Namely, users never drop, and in each time slot they are selected based on the realized costs. This result is useful for explaining the different impacts of the participatory constraint discussed later.

Fig. 6 illustrates the maximum social welfare (benchmark) vs the number of participating users $N$, under different dropping thresholds. We can see that when the dropping threshold is small (e.g., $D_n < 0.35$), the maximum social welfare always increases with the number of users, and the rate of increase is larger for a smaller dropping threshold. When the dropping threshold is large (e.g., $D_n > 0.4$), the maximum social welfare first increases and then decreases with the number of users. This implies that in a sensing system with a mild or no participatory constraint (e.g., a small or zero dropping threshold), we can always increase the social welfare by involving more users in the sensing system. With a stringent participatory constraint (e.g., a large dropping threshold), however, involving more users may not always increase the social welfare, due to the high incentive cost of retaining users in the system.


Fig. 7 illustrates the maximum social welfare (benchmark) vs the dropping threshold $D_n$, with different numbers of users. Each dashed line denotes the maximum social welfare without the participatory constraint (i.e., the upper bound in Fig. 4). We can see that the social welfare decreases with the dropping threshold, as a higher dropping threshold implies that more incentive cost is needed to retain the users in the system. We can further see that such a welfare degradation (induced by the incentive cost) is more severe with a larger number of users, as the total incentive cost increases with the number of users.

VII. Conclusion 

In this work, we studied the optimal sensor selection problem in a general time-dependent and location-aware participatory sensing system with the user long-term participatory constraint. We proposed Lyapunov-based on-line sensor selection (auction) policies, which do not rely on future information and achieve the optimal off-line benchmark performance asymptotically. There are several possible extensions for future work. An interesting one is to study the truthful mechanism when users are not myopic and can anticipate the impact of their activities on future time slots.